Adding Fault-tolerant Transaction Processing to LINDA
Authors
Abstract
To simplify the difficult task of writing fault-tolerant parallel software, we implemented extensions to the basic functionality of the LINDA (tuple-space) programming model. Our approach implements a transaction-processing mechanism to ensure that tuples are properly handled in the event of a node or communications failure. If a process that has retrieved a tuple fails to complete processing, or if a tuple posting or retrieval message is lost, the system is automatically rolled back to a previous stable state. Processing failures and lost messages are detected by time-out alarms. Roll-back is accomplished by reposting the pertinent tuples. Intermediate tuples produced during partial processing are not committed or made available until the process completes. In the absence of faults, system overhead is low. The fault-tolerance mechanism is implemented at the system level and requires little programmer effort or expertise. Two implementations of the model are discussed, one using a UNIX network of workstations and one using a Transputer network. Data measuring model overhead and some aspects of system performance in the presence of faults are presented for an example system.
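The mechanism described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the class and method names (`TupleSpace`, `Transaction`, `begin`, `take`, `commit`, `expire`) are hypothetical, `take` is a non-blocking stand-in for Linda's blocking `in`, and the time-out alarm is modeled as an explicit `expire()` call against an injectable clock. The sketch shows the three ideas named above: time-out detection, roll-back by reposting withdrawn tuples, and withholding intermediate tuples until commit.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Bookkeeping for one in-flight unit of work (hypothetical structure)."""
    deadline: float
    taken: list = field(default_factory=list)    # tuples withdrawn so far
    pending: list = field(default_factory=list)  # outputs not yet visible


class TupleSpace:
    """Toy tuple space with time-out based roll-back (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self.tuples = []    # committed, visible tuples
        self.active = {}    # transaction id -> Transaction
        self.clock = clock  # injectable so tests can control time
        self._next_id = 0

    def begin(self, timeout):
        """Start a transaction that must commit within `timeout` seconds."""
        self._next_id += 1
        self.active[self._next_id] = Transaction(self.clock() + timeout)
        return self._next_id

    def out(self, tup, txn=None):
        """Post a tuple; inside a transaction it is held back until commit."""
        if txn is None:
            self.tuples.append(tup)
        else:
            self.active[txn].pending.append(tup)

    def take(self, pattern, txn):
        """Withdraw the first matching tuple; None in the pattern is a wildcard."""
        for tup in self.tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, tup)
            ):
                self.tuples.remove(tup)
                self.active[txn].taken.append(tup)  # remembered for roll-back
                return tup
        return None  # no match (a real Linda `in` would block instead)

    def commit(self, txn):
        """Publish pending tuples; returns False if already rolled back."""
        record = self.active.pop(txn, None)
        if record is None:
            return False
        self.tuples.extend(record.pending)
        return True

    def expire(self):
        """Stand-in for the time-out alarm: roll back overdue transactions."""
        now = self.clock()
        overdue = [k for k, t in self.active.items() if t.deadline <= now]
        for k in overdue:
            record = self.active.pop(k)
            self.tuples.extend(record.taken)  # repost withdrawn tuples
            # record.pending is discarded: partial results never become visible
        return overdue
```

In this sketch a worker that withdraws `("task", 1)` and then stalls past its deadline has the tuple reposted by `expire()`, so another worker can pick it up, and a late `commit` from the stalled worker is rejected. Keeping the pending outputs private until commit is what prevents partial results from leaking into the shared space.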
Similar resources
Recovery with limited replay: fault-tolerant processes in Linda
Research in the area of fault-tolerant distributed systems has focused to a large extent on data surviving various forms of failure. The replica control algorithms for maintaining mutually consistent replicas abound in number. However, comparatively little work has been devoted to making processes recoverable. In domains other than databases and transaction processing, fault-tolerance generally ...
Fault Tolerance Lessons Applied to Parallel Computing
This paper describes an approach to fault-tolerant parallel computing which is based on the experiences with the most successful fault-tolerant software – the transaction processing systems. The algorithms presented here have less runtime overhead and faster recovery than most preceding approaches. In the Pact parallel programming environment fault tolerance is provided fully user transparent i...
Modeling Fault-Tolerant and Reliable Mobile Agent Execution in Distributed Systems
The reliable execution of a mobile agent is a very important design issue in building a mobile agent system and many fault-tolerant schemes have been proposed so far. To further develop mobile agent technology, reliability mechanisms such as fault tolerance and transaction support are required. For this purpose, we first identify two basic requirements for fault-tolerant mobile agent execution:...
Modeling Fault-Tolerant and Secure Mobile Agent Execution
The reliable execution of a mobile agent is a very important design issue in building a mobile agent system and many fault-tolerant schemes have been proposed so far. Security is a major problem of mobile agent systems, especially when money transactions are concerned. Security for the partners involved is handled by encryption methods based on a public key authentication mechanism and by secr...
PLinda 2.0: A Transactional/Checkpointing Approach to Fault Tolerant Linda
Robust parallel computation in Linda requires both tuple space and processes to be resilient to failure. In this paper, we present PLinda 2.0, a set of extensions to Linda to support robust parallel computation on loosely coupled processors communicating over a network. The principal extensions of PLinda 2.0 to Linda are transaction mechanisms for reliable tuple space and process-private logging ...
Journal:
- Softw., Pract. Exper.
Volume 24, Issue
Pages -
Publication date 1994